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f ABSTRACT ' ^ . ' 

The use of Stanford Achievement Tests (SATs) with hearing impaired
students is considered. Background of the test is given, and the SAT-HI
(SAT for the Hearing Impaired) is discussed in terms of its content, item
wording, and norms. Among six specific recommendations made are to use
the SAT-HI because it minimizes floor and ceiling effects, standardizes
administration and provides norms based on hearing impaired students;
examine the content of the SAT-HI to determine its relevance to the
curriculum; do not interpret the scores of hearing impaired students as
precise estimates of ability; and use raw scores, when applicable, or
scaled scores to assess student growth. (CL)

A Realistic Look at the Stanford Achievement Test:
What does it mean?

Like other standardized and normed collections of achievement test
batteries, the Stanford Achievement Test (SAT) is often used to evaluate
the growth and performance of individual students and groups of students.
For students in regular public elementary and junior high school
classrooms, the SAT has proven to be quite meritorious. The reliabilities
of the various subtests are sufficiently high and the content of the
different level examinations adequately parallels the curriculum taught in
the corresponding grade levels.

In using the SAT or another achievement test designed for regular
classroom students with a special population, such as hearing impaired
adolescents, the test user assumes responsibility for the measure being
adequate for the task and for the proper use of test results. In this
paper, several characteristics of achievement tests in general, and the
SAT in particular, are examined. Specifically, test content, item wording,
and norms are discussed as they relate to evaluating hearing impaired
students.

Background

Since the first edition in 1923, the SAT has not been a single test, but
rather a collection of achievement test batteries. The 1973 edition
contains six carefully developed batteries, each battery consisting of
several subtests to assess skills in areas such as reading comprehension,
mathematics concepts, mathematics computation, science, and social
studies. The Primary Level I Battery is designed for public school
students in grades 1.5 to 2.4; Primary Level II, grades 2.5 to 3.4;
Primary Level III, grades 3.5 to 4.4; Intermediate Level I, grades 4.5 to
5.4; Intermediate Level II, grades 5.5 to 6.9; and Advanced, grades 7.0 to
9.5.

In the development of the SAT, objectives matching the core curriculum of
each grade in elementary schools were developed after a careful survey of
existing curricula. These objectives served as a guide in the development
of the publisher's textbook series and in writing items for the SAT (see
SAT Manual, Part V, pp. 12-13).

To ensure that the item wording was as appropriate as possible, twice as
many items as needed were developed and administered to an item tryout
sample. This pre-test sample was selected so as to closely match the
United States population in terms of percentage of students by community
size, percentage of students by geographic region, median family income,
median years of parental schooling, percentage of Black Americans, and
other variables (see Stanford Research Report #3). Based on the item
tryout, item statistics were computed and the best items were retained for
the final version. In the Primary II battery, for example, 2,565 items
were piloted and 1,326 items were retained for the three final forms.

After items were selected for the final forms, norms, such as grade
equivalent scores and percentiles, were developed. These norms provide a
means for comparing the performance of one student or group of students
with that of some particular reference group. While several reference
groups are possible, the SAT, like most other standardized achievement
test batteries, uses a representative sample of the United States school
children population as its benchmark.

In context, these norms can have meaning for most school systems. This is
true because the norms describe one reality, i.e., the typical performance
of the country's students on these items. However, the context is
restrictive. It assumes adequate measurement, relevance of the items, and
appropriate interpretation. These limitations are extremely important when
a test is used as an assessment device in a program for hearing impaired
students.

A national survey by the Office of Demographic Studies showed that the SAT
was the most popular standardized achievement test among educators of the
deaf. Of the 29,023 hearing impaired students to receive any standardized
achievement test during the 1972-73 school year, approximately 77% or
22,292 students would be taking the SAT (Buchanan, 1973). Because of its
popularity, the Office of Demographic Studies decided to facilitate proper
use by compiling a special edition of the 1973 SAT, the Stanford
Achievement Test-Hearing Impaired version (SAT-HI). While the original
items and subtests were retained, the level of the SAT-HI in which the
subtests appeared was changed. Thus, for example, the Level II battery of
the SAT-HI contains the Vocabulary and Communication Comprehension
subtests of the SAT Primary Level I battery and the Mathematics
Computation and Spelling subtests of the SAT Primary Level III battery.
The other subtests are those which appear in the SAT Primary II battery.
This technical modification reduces the number of students scoring at the
extreme ends of the subtests (floor and ceiling effects) and, when coupled
with standardized administration and the use of special norms for hearing
impaired students, provides for the more accurate assessment of student
abilities. With these improvements, the SAT-HI is preferred to the SAT for
use with hearing impaired students. However, since the item wording and
content are unaltered, and since norms developed for hearing students are
still used, the SAT-HI is not to be viewed as or interpreted like a test
entirely designed, standardized, and normed for a hearing impaired
adolescent population.

Content

The content of test items has been identified as one of the most important
aspects in selecting and judging tests (Hoepfner, 1977). It is the content
of the items that determines which composite skill is being assessed by a
particular instrument. Several tests may measure reading ability; however,
how reading ability is defined and operationalized can vary greatly from
test to test. For example, the second grade level of the Sequential Test
of Educational Progress (STEP) contains a large number of items assessing
phonetic word attack skills. The SAT and SAT-HI Level I reading tests
contain few such items. On the other hand, the low levels of the SAT-HI
contain a number of items assessing students' abilities to infer meaning
of words from context and the STEP contains none.

As an example of how a particular set of abilities is defined, Table 1
contains a breakdown of the content gauged by the SAT-HI Math Computation
subtests at the six different levels. These content classifications help
the test user index the subtests' relevancy. If one is teaching high
school geometry and the students take the Level III examination, which
emphasizes knowledge of the primary facts and the basic addition,
subtraction, multiplication and division algorithms, then the students'
scores do not reflect their classroom endeavors. Use of such a test to
judge mastery of the curriculum would be faulty. However, using it to
gauge ability in these basic skills, which may not be covered by the
school's curriculum, does provide some useful, although limited, feedback.

In addition to the content covered by a subtest, a test user should
consider the proportion of items within each content classification within
a subtest. For example, the SAT-HI Math Computation Level I subtest
emphasizes simple verbal problems and addition and subtraction facts. The
same subtest at Level III emphasizes all the basic operations (see Table
1). While the subtests have the same name, they really assess different
abilities and scores on the two tests do not necessarily represent the
same ability. Therefore, caution must be exerted in comparing students
taking tests at different levels. A student who obtains a grade equivalent
score of 2.8 on the Level II exam will not necessarily receive a 2.8 on
the Level III exam.

Table 1

Percentage of Items within the Six Levels of the SAT-HI
Mathematics Computation Subtests by Content Area

                                                    SAT-HI Level
Item Grouping                                1     2     3     4     5     6

Addition and subtraction facts             41%
Mathematical sentences                     15%
Verbal problems                            44%
Knowledge of primary facts                       66%   45%   53%   42%   42%
Addition and subtraction algorithms              17%   22%   13%   11%   11%
Multiplication and division algorithms           17%   33%   25%
Common fractions                                              9%   11%   11%
Other operational models                                           27%   27%

In practice, the curriculum of most school districts, and especially the
curriculum in programs for the hearing impaired, is not fully reflected in
the content of a standardized achievement test. This does not mean that
scores on standardized achievement tests are worthless; only that they
must be evaluated in perspective. In a school for the deaf, they provide
an index of how well hearing impaired students perform on certain tasks,
tasks which are representative of the basic skills taught the standard,
albeit normal hearing, school children population at certain elementary
grade levels.

Since part, as opposed to all, of a subtest may be relevant to the
curriculum efforts or achievement desires, attention might be given to the
results on a particular content classification as opposed to total scores.
This is referred to as objective-referencing. The SAT was specifically
"designed for dual interpretation in the normed-referenced and objective-
referenced modes" (SAT Manual, Part V, p. 12), and Item Analysis Reports
are available as part of the publisher's scoring service. A school for the
deaf may be particularly interested in having its students capable of
performing the basic addition, subtraction, multiplication and division
operations and not particularly concerned with fraction operations. By
determining the school's average on the appropriate items within the Level
IV, V, and VI Mathematics Computation subtests, the test user is in a
position to state whether students taking the exam are performing
satisfactorily in this area of interest. Those wishing to capitalize on
this beneficial analytical method are referred for additional information
to the Stanford Research Report #10 and to the SAT Manual Part III:
Teacher's Guide for Interpreting.
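
The objective-referenced summary described above can be sketched in a few
lines of Python. This is only an illustration, not the publisher's Item
Analysis Report: the function name, the content-area labels, and the 0/1
response data are hypothetical, and the sketch assumes each item has been
tagged with a single content classification.

```python
from statistics import mean


def area_percent_correct(responses, item_areas, areas_of_interest):
    """Average percent correct on the items in the chosen content areas.

    responses: one list of 0/1 item scores per student (hypothetical data).
    item_areas: content classification label for each item, in item order.
    areas_of_interest: set of labels the school cares about.
    """
    # Indices of the items that belong to the areas of interest.
    keep = [i for i, area in enumerate(item_areas) if area in areas_of_interest]
    if not keep:
        raise ValueError("no items fall in the requested content areas")
    # Percent correct on those items for each student, then the school mean.
    per_student = [100.0 * sum(r[i] for i in keep) / len(keep) for r in responses]
    return mean(per_student)


# Hypothetical four-item subtest taken by two students.
item_areas = ["add_sub", "add_sub", "mult_div", "fractions"]
responses = [[1, 1, 1, 0], [1, 0, 1, 0]]
school_avg = area_percent_correct(responses, item_areas, {"add_sub", "mult_div"})
print(round(school_avg, 1))  # 83.3
```

The fraction item is ignored here, so the school average reflects only the
basic operations the school has chosen to emphasize.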

Item Wording

Items are worded to gauge skills in particular areas. The wording of the
item, however, may be such that undesired skills are gauged. For example,
an item may be designed to gauge mathematics ability, but because of the
wording, it may largely assess reading ability. This is a major problem
when tests are used with minority students and has led to much of the work
in test and item bias.

Several linguistic structures have been identified as causing undue
difficulty for hearing impaired students (Rudner, 1978). These include
conditionals (if, then), inferentials (could, should), comparatives
(greater than, less than), negations (not, without), and low information
pronouns (it, something). The SAT-HI, particularly the vocabulary
subtests, contains items that incorporate one or more of these structures.
Consequently, the results on various subtests may not always reflect the
intended skills and scores can be spuriously low.

Because hearing impaired students taking the SAT-HI tend to be older than
hearing students taking the SAT, hearing impaired students are able to
draw from a larger repertoire of experience in responding to the items.
Thus, other item wordings can favor hearing impaired examinees and
spuriously raise their scores.



Effect of Item Wording on Test Accuracy

Perfect test items, that is, items which for a particular age group
measure only an intended skill, are difficult, if not impossible, to
develop. The item tryout procedure will identify the items which are best
for a given population, but there still will be errors in measurement.
These errors will be increased when special populations are used and
increased even more with age differences between the special population
and the standardization sample.

Measurement error can be gauged in several ways. The reliability
coefficient indexes the expected consistency of test results. The higher
the reliability, the greater is the expected consistency. In comparing the
SAT subtests used with hearing elementary school children against the
SAT-HI subtests used with hearing impaired adolescents, the reliabilities
are consistently higher for hearing children. The reliability on the Level
II reading comprehension subtest, for example, is .95 for hearing
examinees and .83 for hearing impaired examinees. This differential
reliability means that the scores for hearing impaired students contain
more error than the scores for hearing students.

The reliability coefficient is a useful statistic for evaluating a test;
however, it is not directly applicable in interpreting test results. A
more useful statistic for this purpose is the Standard Error of
Measurement (SEM), which provides an estimate of the variation of the
amount of error in the test scores for a given population. The larger the
SEM, the larger the interval in which one is confident an examinee's true
ability lies.
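
The link between reliability and the SEM can be made concrete with the
standard classical test theory relation SEM = SD * sqrt(1 - reliability).
The sketch below is illustrative only: the raw-score standard deviation of
10 is a hypothetical value, not a figure from the SAT-HI reports, and it is
applied to both groups simply to isolate the effect of the two
reliabilities quoted above. The band of one SEM on either side of the
observed score is one common convention, chosen here as an assumption.

```python
import math


def sem(sd, reliability):
    """Standard error of measurement under classical test theory."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must lie between 0 and 1")
    return sd * math.sqrt(1.0 - reliability)


def score_band(observed, sem_value, n_sems=1.0):
    """Interval of n_sems standard errors around an observed score."""
    return (observed - n_sems * sem_value, observed + n_sems * sem_value)


ASSUMED_SD = 10.0  # hypothetical raw-score SD, assumed equal for both groups

sem_hearing = sem(ASSUMED_SD, 0.95)           # reliability quoted for hearing examinees
sem_hearing_impaired = sem(ASSUMED_SD, 0.83)  # reliability quoted for hearing impaired examinees

print(round(sem_hearing, 2), round(sem_hearing_impaired, 2))
print(score_band(30, sem_hearing_impaired))
```

Under the equal-SD assumption, lowering the reliability from .95 to .83
raises the SEM by a factor of sqrt(.17/.05), or roughly 1.8, which widens
the interval around every observed score by the same factor.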

Table 2 outlines the Standard Errors of Measurement based on hearing
impaired examinees for the subtests by level of the SAT-HI, expressed in
terms of raw scores. Considering that most subtests contain about 50
items, these SEMs are fairly large. If one were using grade equivalent
scores, one SEM would correspond to about three-tenths of a grade
equivalent for the median student and more for the higher and lower
ability students. Thus, scores on the SAT-HI, like any other test, are not
to be taken as precise estimates of ability.

Table 2

Raw Score Standard Errors of Measurement for the Subtests
of the SAT-HI by Level

                                        SAT-HI Level
Subtest Area                     1     2     3     4     5     6

Vocabulary                     2.8   2.8   2.8   3.0   3.1   3.1
Reading A                      2.9   2.7
Reading B                      2.9   3.1
Reading Comprehension          4.1   4.1   3.6         3.6   3.6
Word Study Skills              3.4   3.5   3.1   3.0   2.9
Math Concepts                  2.4   2.6   2.5   2.7   2.7   2.6
Math Computation               2.4   2.4   2.6   2.8   2.9   2.8
Math Application                     2.3   2.2   2.5   2.6   2.6
Spelling                             2.8   2.9   3.3   3.5   3.1
Language                                   3.2   4.0   4.1   3.9
Social Science                       2.3   2.9   3.5   3.3   3.4
Science                              2.4   2.9   3.6   3.6   3.5
Communication Comprehension    2.4   2.4   3.2   3.2   3.2

Adapted with permission from Jensema, Trybus & Schildroth (In press).



Norms 



Norms provide a frame of reference in interpreting test results. Student
performance can be compared to national averages by way of grade
equivalent scores and percentiles. Student gain over time can be gauged by
differences in these norm scores, or more appropriately by differences in
scaled scores, which are specifically designed for this purpose. While
norms based on hearing elementary school students and on hearing impaired
students are available, the latter set is most relevant and meaningful to
schools for the deaf.

In developing and using norms, careful consideration needs to be given to
the reference population (Angoff, 1971). This was done by the test
publishers, who went to great lengths to ensure that the standardization
sample closely matched the average United States population (see Stanford
Research Report #3). Similarly, the Office of Demographic Studies
carefully delineated its sample in developing special norms for hearing
impaired students (see Trybus and Karchmer, 1977). Thus the test user need
not be particularly concerned with the representativeness of the various
norms. However, recognition of their limitations is essential for proper
use and interpretation.

Perhaps one of the most misunderstood types of norm is the grade
equivalent score (GES). This norm, which is available only with hearing
students as the reference group, is intended to provide a method of
describing pupil performance in terms of median public school grade level
performance. Suppose data were collected from representative samples of
children at all grade levels in the seventh month of the school year. A
table could be developed for the seventh month of each grade which
converts raw scores to their percentile equivalents. Similarly, the median
score at each grade level could be converted to a grade equivalent score
of the form grade level plus .7. The median score for first graders in the
seventh month of the school year would convert to a GES of 1.7, for second
graders, 2.7, and so on.

Grade equivalent scores are given by the publishers for points other than
1.7, 2.7, 3.7, etc. These GES's were derived by extrapolating between Fall
and Spring norm results. Psychometricians, however, have clearly pointed
out that extrapolation of norms is statistically unsound and lends the
norms to serious misinterpretation (e.g., Tallmadge, 1977).
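
The construction described in the last two paragraphs can be sketched as
follows. The median raw scores are invented for illustration, and
straight-line interpolation between anchor points is an assumption made
here; the source does not give the publisher's actual procedure for
filling in the intermediate values.

```python
def ges_anchors(median_raw_by_grade, month=7):
    """Pair each grade's median raw score with a GES of grade + month/10."""
    return sorted((raw, grade + month / 10) for grade, raw in median_raw_by_grade.items())


def interpolated_ges(raw, anchors):
    """Linearly interpolate a GES for a raw score lying between two anchors.

    Returns None outside the anchored range rather than extrapolating.
    Assumes the anchor raw scores are strictly increasing.
    """
    for (raw_lo, ges_lo), (raw_hi, ges_hi) in zip(anchors, anchors[1:]):
        if raw_lo <= raw <= raw_hi:
            fraction = (raw - raw_lo) / (raw_hi - raw_lo)
            return ges_lo + fraction * (ges_hi - ges_lo)
    return None


# Hypothetical medians observed in the seventh month of grades 1 to 3.
anchors = ges_anchors({1: 12, 2: 20, 3: 30})
print(anchors)
print(interpolated_ges(16, anchors))  # halfway between the grade 1 and 2 anchors
print(interpolated_ges(40, anchors))  # outside the tested range: None
```

Only the anchor points correspond to scores that were actually observed.
Every value in between depends on the fill-in rule that was chosen, which
is the source of the caution raised above.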

One obvious and serious consequence of extrapolating GES is reflected in
the anomalies of converting GES's on the 1964 edition of the SAT to GES on
the 1973 edition of the same test. Table 3, taken from Research Report #5
of the SAT, shows the corresponding grade equivalent scores on the two
Intermediate II tests. For example, a GES of 3.7 on the 1964 Spelling test
would correspond to 4.3 on the 1973 Spelling test. That is, by changing
from the 1964 to the 1973 version of the SAT, a student would exhibit .6
GES growth without any corresponding increase in ability.



Table 3

1964 SAT Intermediate II GES versus the 1973 Intermediate II GES

Each column pairs a 1964 subtest with its 1973 counterpart: Word Meaning
and Vocabulary; Paragraph Meaning and Reading Comprehension; Spelling and
Spelling; Arith. Comp. and Math. Comp.; Arith. Comp. and Math. Concepts.

[Table values are not legible in this copy.]

[...] in some instances month-to-month gains can be expected without any
increase in ability, and the learning curve projected from the GES is
unrealistic.

As long as testing is conducted at the seventh month, the percentile
scores, unlike the GES, would have clear meaning. A student's performance
is defined by the percentile as the percent of students in the reference
population scoring lower than that student on that level of the SAT in the
seventh month. It is for this reason that percentiles are given for a
specific time of the school year. If testing is conducted at a different
time of the year, interpretation of the percentile score is unclear and
tenuous at best. It cannot be determined, for example, whether a student
taking the SAT in the second month and scoring in the 45th percentile is
above or below average with respect to his grade level peers.
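
The percentile definition used above, the percent of the reference
population scoring lower than the student, can be written directly. The
norming scores below are hypothetical, and published norm tables may handle
tied scores differently; this sketch counts only scores strictly below the
student's.

```python
def percentile_rank(score, reference_scores):
    """Percent of the reference group scoring strictly below `score`."""
    if not reference_scores:
        raise ValueError("reference group is empty")
    below = sum(1 for s in reference_scores if s < score)
    return 100.0 * below / len(reference_scores)


# Hypothetical raw scores from a norming sample tested in the seventh month.
seventh_month_norms = [10, 12, 15, 15, 18, 20, 22, 25, 27, 30]
print(percentile_rank(20, seventh_month_norms))  # 50.0
```

The result is only as meaningful as the match between the student's testing
conditions and those of the reference list. A score obtained in the second
month but compared against seventh-month norms yields a number, yet not an
interpretable standing.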

While the GES and percentiles can lend themselves to misinterpretation
when used with hearing elementary school students, they do provide
meaningful anchors when used properly. The test scores of a third grade
student in a public school taking the Level II battery can be mapped to a
meaningful percentile score and a meaningful GES. Providing that the test
was administered at the same time of the school year in which the norming
was conducted, the student's percentile score can provide an index of the
relative standing of this student with respect to other third graders. For
example, had she scored in the 80th percentile, her score can be
interpreted as being better than 80 and lower than 20 out of every 100
third grade students taking the test at the same time of the year. The GES
can index whether she is performing, [...] of other third grade programs
in other schools, again providing a gross index of poor, typical, or above
average [...]

Anchored to the populations of grade level public elementary school
children, the GES's and percentile scores are of limited value when used
to describe the performance of hearing impaired adolescents. The hearing
impaired adolescent is older and is not receiving the same curriculum as
the hearing student. The feedback provided by a test which says that a
high school aged hearing impaired student does better than 40% of the
second grade public school children on a test which emphasizes a different
curriculum is almost irrelevant. To say that a student had a GES of 3.5
basically defies interpretation. The scale is not relevant to the student
or the efforts of the scholastic program. Gross comparison of the
performance of hearing impaired adolescents to the public elementary
school curriculum is misleading enough; the use of monthly equivalents,
giving the false impression of greater accuracy, compounds the matter,
especially if one considers that there may be at least three months error
in that decimal.

A total abandonment of public school grade level percentiles and grade
equivalent scores would eliminate much of the misuse of test scores and
probably serve to enhance the utility of the SAT-HI. One might argue that
a comparison with hearing students is essential since it indexes the
potential of deaf students. However, the fact that a test designed for
second grade public school children is administered to high school aged
hearing impaired students already provides substantial comparative
information.




[...] of students in U.S. programs for the hearing impaired, these norms
allow for meaningful descriptions of how well a hearing impaired student
or group of hearing impaired students performed on the SAT-HI.



Percentiles for hearing impaired students based on test levels are printed
on the score reports for those who use the publisher's computer scoring
service. These percentiles provide for comparison of a hearing impaired
child with other hearing impaired children who were tested at the same
difficulty and on the same content. They do not, however, allow for
comparisons across levels.

Percentiles for hearing impaired students based on age are available
through the Office of Demographic Studies. These percentiles allow one to
determine how well an individual student performed in comparison to a
national sample of hearing impaired children of the same age, regardless
of the level of the SAT-HI. In using these percentiles, one must remember
that the content does differ across the levels of the SAT-HI, so the
comparison will not always be exact.

Like other percentiles, these percentiles based on hearing impaired
examinees only have meaning if testing is conducted at the same time of
the year the norming sample took the examination. Thus, if one wants to
meaningfully describe student performance, testing should be conducted
during the Spring.



One major purpose of achievement tests is to determine whether students
[...] skills over time. While the SAT-HI may not be amenable [...]
describe student performance in terms of relative standing within a group
of students. Over time, the group will have progressed, but the student's
standing may remain the same. Occasionally, researchers use the GES as a
relative index of ability by comparing changes in GES. However, this is
not good practice, since the GES is not on an equal-interval scale: the
difference between a GES of 3.2 and 3.8 is not the same as the difference
between a GES of 6.2 and 6.8. Further, with the error inherent in
extrapolating values, the error in the gain scores becomes quite large.

If one is interested in student gains, two psychometrically sound options
are available. The first preference is to use raw scores without any
transformation. However, this is only possible when the same level of the
SAT-HI is given during both administrations. An alternate choice,
applicable regardless of the examination level taken, is to use scaled
scores.

Scaled scores have the unique advantage of providing approximately equal
units on a continuous scale. The scale has two reference points: scaled
scores of 132 and 182 correspond to a GES of 3.2 and 8.2, and each unit is
intended to represent the average monthly gain over a five year period.
However, the absolute meaning of the units is not of interest when
assessing gain. By describing location on a continuous, equal-interval
scale, scaled scores overcome some of the difficulties of using
percentiles and grade equivalent scores to assess gain. A difference of
five scaled score points, for example, is the same regardless of where it
occurs on the scale.
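
The two options for measuring gain can be sketched as a simple rule: use
the raw-score difference when both administrations used the same level,
and otherwise fall back to the scaled-score difference. The class, the
function, and all score values below are hypothetical; in practice the
scaled scores would be read from the publisher's conversion tables.

```python
from dataclasses import dataclass


@dataclass
class Administration:
    level: int         # SAT-HI level taken (1 to 6)
    raw_score: int
    scaled_score: int  # assumed to come from the publisher's tables


def gain(first, second):
    """Return (metric, gain): raw scores when levels match, else scaled scores."""
    if first.level == second.level:
        return ("raw", second.raw_score - first.raw_score)
    return ("scaled", second.scaled_score - first.scaled_score)


# Hypothetical scores for one student on successive occasions.
same_level = gain(Administration(3, 21, 128), Administration(3, 27, 135))
different_level = gain(Administration(3, 27, 135), Administration(4, 19, 141))
print(same_level, different_level)

# The two reference points quoted above imply about ten units per grade,
# that is, about one unit per school month.
units_per_grade = (182 - 132) / (8.2 - 3.2)
print(round(units_per_grade, 1))
```

Raw scores from different levels are never subtracted from one another in
this sketch, since the levels differ in content and difficulty.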



[...] appropriately worded and norms which provide meaningful indices of
growth and [...] available. The test has been shown to be both valid and
reliable for this population.

The content, wording and norms of the SAT were re-examined in this paper
in order to clarify the use and interpretation of the SAT with hearing
impaired students. Specific recommendations are:

1. Use the Special Edition of the SAT, the SAT-HI, which minimizes floor
and ceiling effects, standardizes administration and provides norms
based on hearing impaired students.

2. Examine the content of the SAT-HI to determine its relevancy to the
curriculum efforts.

3. Use content classification analysis when appropriate.

4. Do not interpret the scores of hearing impaired students as precise
estimates of ability.

5. Use the percentiles based on hearing impaired students to assess
student performance.

6. Use raw scores, when applicable, or scaled scores to assess student
growth.